- 1. Input Data Formats;
- a. Each pattern must consist of its inputs followed by zero
- or more outputs, so ordinary training data files can be
- used without modification.
- b. Training data for classification typically has N features
- followed by the class id.
- c. Training data for mapping typically has N
- features followed by several desired output values.
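
The input layout described above can be illustrated with a small Python sketch. It assumes one pattern per line with whitespace-separated numeric values, which this text does not actually specify, and the names `split_pattern` and `read_patterns` are hypothetical helpers, not part of the program.

```python
def split_pattern(line, n_inputs):
    """Split one pattern line into (inputs, outputs).

    Assumes whitespace-separated numbers; the first n_inputs values are
    the inputs and whatever follows (possibly nothing) is the outputs.
    """
    values = [float(tok) for tok in line.split()]
    if len(values) < n_inputs:
        raise ValueError(f"expected at least {n_inputs} values, got {len(values)}")
    return values[:n_inputs], values[n_inputs:]


def read_patterns(lines, n_inputs):
    """Parse an iterable of text lines, skipping blank ones."""
    return [split_pattern(ln, n_inputs) for ln in lines if ln.strip()]


# Hypothetical data: two features followed by a class id, plus one
# pattern with no outputs at all (mixed here only for illustration).
sample = ["0.1 0.2 1", "0.9 0.8 2", "0.5 0.5"]
patterns = read_patterns(sample, n_inputs=2)

# Clustering only needs the inputs, so the outputs can be dropped.
inputs_only = [x for x, _ in patterns]
```

Because the outputs are simply the trailing values, the same reader handles classification files, mapping files, and files with no outputs.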
-
- 2. Output Data Format;
- Output files from clustering include the number
- of clusters, followed by the cluster vectors themselves.
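
As a rough illustration of this output layout, the sketch below writes the cluster count on the first line and one cluster vector per following line. The exact delimiters and number formatting used by the program are not stated here, so the whitespace-separated layout and the helper names `format_clusters` and `parse_clusters` are assumptions.

```python
def format_clusters(centers):
    """Return text with the cluster count, then one vector per line."""
    lines = [str(len(centers))]
    for c in centers:
        lines.append(" ".join(f"{v:.6g}" for v in c))
    return "\n".join(lines) + "\n"


def parse_clusters(text):
    """Read back text produced by format_clusters."""
    rows = [ln for ln in text.splitlines() if ln.strip()]
    count = int(rows[0])
    centers = [[float(t) for t in row.split()] for row in rows[1:1 + count]]
    if len(centers) != count:
        raise ValueError("file has fewer cluster vectors than its header claims")
    return centers


# Hypothetical cluster vectors, used only to show the round trip.
text = format_clusters([[0.0, 1.5], [2.0, 3.25]])
restored = parse_clusters(text)
```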
-
- 3. Conventional Clustering;
- a. Cluster a data file using Sequential Leader or
- K-Means Clustering.
- b. Desired outputs, if any, can be ignored.
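
The two conventional methods can be sketched as follows. This is a minimal textbook version, not the program's own code. It assumes that Sequential Leader starts a new cluster whenever a pattern lies farther than a user-chosen `threshold` from every existing leader, and that K-Means is the usual alternation of nearest-center assignment and mean update. All function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest(x, centers):
    """Index of the center closest to x."""
    return min(range(len(centers)), key=lambda i: sq_dist(x, centers[i]))


def sequential_leader(data, threshold):
    """One pass over the data; far-away patterns become new leaders."""
    leaders = []
    for x in data:
        if not leaders or sq_dist(x, leaders[nearest(x, leaders)]) > threshold ** 2:
            leaders.append(list(x))
    return leaders


def k_means(data, centers, max_iter=100):
    """Refine initial centers until they stop changing."""
    centers = [list(c) for c in centers]
    for _ in range(max_iter):
        labels = [nearest(x, centers) for x in data]
        new_centers = []
        for i, old in enumerate(centers):
            members = [x for x, lab in zip(data, labels) if lab == i]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(old)  # keep a center that lost all members
        if new_centers == centers:
            break
        centers = new_centers
    return centers, [nearest(x, centers) for x in data]


# Inputs only; any desired outputs in the file would be ignored.
data = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
leaders = sequential_leader(data, threshold=3.0)
centers, labels = k_means(data, leaders)
```

Using the Sequential Leader result to seed K-Means is one common pairing. Whether the program combines the two this way is not stated here.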
-
- 4. Self-Organizing Map;
- a. Cluster a data file using Kohonen's Self-Organizing
- Feature Map.
- b. Desired outputs, if any, can be ignored.
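
A self-organizing map can be sketched in a few lines. The version below is a simplified one-dimensional Kohonen map with a Gaussian neighborhood and linearly decaying learning rate and radius. The map topology, the schedules, and the parameter names (`lr0`, `radius0`, `epochs`, `seed`) are all assumptions chosen for illustration, not the program's actual settings.

```python
import math
import random


def train_som(data, n_units, epochs=50, lr0=0.5, radius0=None, seed=0):
    """Train a 1-D chain of n_units weight vectors on the input patterns."""
    rng = random.Random(seed)
    dim = len(data[0])
    if radius0 is None:
        radius0 = max(n_units / 2.0, 1.0)
    # Initialise each unit from a randomly chosen training pattern.
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for p in order:
            x = data[p]
            frac = step / total
            lr = lr0 * (1.0 - frac)                     # decaying learning rate
            radius = max(radius0 * (1.0 - frac), 0.5)   # shrinking neighborhood
            # Winner: the unit whose weight vector is closest to x.
            winner = min(
                range(n_units),
                key=lambda i: sum((xi - wi) ** 2 for xi, wi in zip(x, weights[i])),
            )
            # Move the winner and its map neighbors toward x.
            for i in range(n_units):
                h = math.exp(-((i - winner) ** 2) / (2.0 * radius ** 2))
                for k in range(dim):
                    weights[i][k] += lr * h * (x[k] - weights[i][k])
            step += 1
    return weights


# Hypothetical one-input data with two obvious groups.
data = [[0.0], [0.1], [0.9], [1.0]]
som_weights = train_som(data, n_units=2)
```

Unlike K-Means, each update also pulls the winner's neighbors on the map toward the pattern, which is what gives the trained units their topological ordering.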
-
- 5. Error Function;
-
- The error function that is being minimized during K-Means
- clustering and self-organizing map training is
-
- $$\mathrm{MSE} = \frac{1}{N_{\text{pat}}} \sum_{k=1}^{N} \mathrm{MSE}(k), \quad \text{where}$$
-
- $$\mathrm{MSE}(k) = \sum_{p=1}^{N_{\text{pat}}} \left[ x(p,k) - m(i(p),k) \right]^{2},$$
-
- Npat is the number of training patterns, N is the number
- of inputs per pattern, x(p,k) is the kth input of the
- pth pattern, m(i,k) is the kth element of the ith cluster
- vector, and i(p) is the index of the cluster to which the
- pth pattern belongs.
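
The error function can be checked numerically with a short sketch. It follows the definitions above directly, except that Python indices start at 0 rather than 1. The function name `clustering_mse` and the sample data are hypothetical.

```python
def clustering_mse(data, centers, assignment):
    """Return (MSE, [MSE(k) for each input k]).

    data[p][k]     is x(p,k), the kth input of the pth pattern
    centers[i][k]  is m(i,k), the kth element of the ith cluster vector
    assignment[p]  is i(p), the cluster index of the pth pattern
    """
    n_pat = len(data)
    n_inputs = len(data[0])
    mse_k = [
        sum((data[p][k] - centers[assignment[p]][k]) ** 2 for p in range(n_pat))
        for k in range(n_inputs)
    ]
    return sum(mse_k) / n_pat, mse_k


# Four patterns, two clusters; each pattern sits 0.5 from its center
# along the second input and exactly on it along the first.
data = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centers = [[0.0, 0.5], [10.0, 10.5]]
assignment = [0, 0, 1, 1]
mse, mse_k = clustering_mse(data, centers, assignment)
```

Here MSE(k) for the first input is 0 and for the second is 4 x 0.25 = 1, so the overall MSE is 1/4 = 0.25.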